Info file internals, produced by texinfo-format-buffer   -*-Text-*-
from file internals.texinfo


This file documents the internals of the GNU compiler.

Copyright (C) 1987 Richard M. Stallman.

Permission is granted to make and distribute verbatim copies of
this manual provided the copyright notice and this permission notice
are preserved on all copies.

Permission is granted to copy and distribute modified versions of this
manual under the conditions for verbatim copying, provided also that the
section entitled "GNU CC General Public License" is included exactly as
in the original, and provided that the entire resulting derived work is
distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual
into another language, under the above conditions for modified versions,
except that the section entitled "GNU CC General Public License" may be
included in a translation approved by the author instead of in the original
English.





File: internals  Node: Comparisons, Prev: Arithmetic, Up: RTL, Next: Bit Fields

Comparison Operations
=====================

Comparison operators test a relation on two operands and are considered to
represent the value 1 if the relation holds, or zero if it does not.  The
mode of the comparison is determined by the operands; they must both be
valid for a common machine mode.  A comparison with both operands constant
would be invalid as the machine mode could not be deduced from it, but such
a comparison should never exist in rtl due to constant folding.

Inequality comparisons come in two flavors, signed and unsigned.  Thus,
there are distinct expression codes `GT' and `GTU' for signed and
unsigned greater-than.  These can produce different results for the same
pair of integer values: for example, 1 is signed greater-than -1 but not
unsigned greater-than, because -1 when regarded as an unsigned 32-bit
value is actually 0xffffffff, which is greater than 1.
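
As a rough illustration of the signed/unsigned distinction, the
following C sketch models `gt' and `gtu' on 32-bit values.  The
function names are hypothetical and are not part of the compiler; the
sketch only mirrors the arithmetic described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not GCC functions: model `gt' and `gtu' on a
   32-bit fixed-point mode.  Each yields 1 or 0, like the RTL
   comparison it models.  */
int model_gt(int32_t x, int32_t y)
{
    return x > y;
}

int model_gtu(int32_t x, int32_t y)
{
    /* Reinterpret the same bit patterns as unsigned values.  */
    return (uint32_t) x > (uint32_t) y;
}

/* The example from the text: 1 compared against -1.  */
void comparison_example(void)
{
    assert(model_gt(1, -1) == 1);    /* signed: 1 > -1 */
    assert(model_gtu(1, -1) == 0);   /* unsigned: 1 < 0xffffffff */
}
```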

The signed comparisons are also used for floating point values.  Floating
point comparisons are distinguished by the machine modes of the operands.

The comparison operators may be used to compare the condition codes
`(cc0)' against zero, as in `(eq (cc0) (const_int 0))'.
Such a construct actually refers to the result of the preceding
instruction in which the condition codes were set.  The above
example stands for 1 if the condition codes were set to say
"zero" or "equal", 0 otherwise.  Although the same comparison
operators are used for this as may be used in other contexts
on actual data, no confusion can result since the machine description
would never allow both kinds of uses in the same context.

`(eq X Y)'     
     1 if the values represented by X and Y are equal,
     otherwise 0.
     
`(ne X Y)'     
     1 if the values represented by X and Y are not equal,
     otherwise 0.
     
`(gt X Y)'     
     1 if X is greater than Y, otherwise 0.  If they are fixed-point,
     the comparison is done in a signed sense.
     
`(gtu X Y)'     
     Like `gt' but does unsigned comparison, on fixed-point numbers only.
     
`(lt X Y)'     
`(ltu X Y)'     
     Like `gt' and `gtu' but test for "less than".
     
`(ge X Y)'     
`(geu X Y)'     
     Like `gt' and `gtu' but test for "greater than or equal".
     
`(le X Y)'     
`(leu X Y)'     
     Like `gt' and `gtu' but test for "less than or equal".
     
`(if_then_else COND THEN ELSE)'     
     This is not a comparison operation but is listed here because it is
     always used in conjunction with a comparison operation.  To be
     precise, COND is a comparison expression.  This expression
     represents a choice, according to COND, between the value
     represented by THEN and the one represented by ELSE.
     
     On most machines, `if_then_else' expressions are valid only
     to express conditional jumps.


File: internals  Node: Bit Fields, Prev: Comparisons, Up: RTL, Next: Conversions

Bit-fields
==========

Special expression codes exist to represent bit-field instructions.
These types of expressions are lvalues in rtl; they may appear
on the left side of an assignment, indicating insertion of a value
into the specified bit field.

`(sign_extract:SI LOC SIZE POS)'     
     This represents a reference to a sign-extended bit-field contained or
     starting in LOC (a memory or register reference).  The bit field
     is SIZE bits wide and starts at bit POS.  The compilation
     switch `BITS_BIG_ENDIAN' says which end of the memory unit
     POS counts from.
     
     Which machine modes are valid for LOC depends on the machine,
     but typically LOC should be a single byte when in memory
     or a full word in a register.
     
`(zero_extract:SI LOC SIZE POS)'
     Like `sign_extract' but refers to an unsigned or zero-extended
     bit field.  The same sequence of bits is extracted, but it is
     filled out to an entire word with zeros instead of by sign-extension.
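
The difference between the two extractions can be sketched in C as
follows.  This is only a model under stated assumptions: LOC is a
32-bit word, POS is assumed to count from the least significant bit
(on a real target `BITS_BIG_ENDIAN' decides this), and the function
names are hypothetical.  The final conversion to `int32_t' assumes the
usual two's-complement behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `zero_extract': take SIZE bits of LOC
   starting at bit POS and fill the rest of the word with zeros.  */
uint32_t model_zero_extract(uint32_t loc, unsigned size, unsigned pos)
{
    assert(size >= 1 && size <= 32 && pos <= 32 - size);
    uint32_t field = loc >> pos;
    if (size < 32)
        field &= (UINT32_C(1) << size) - 1;   /* keep only SIZE bits */
    return field;
}

/* Hypothetical model of `sign_extract': the same bits, but the top
   bit of the field is copied into all higher bits of the word.  */
int32_t model_sign_extract(uint32_t loc, unsigned size, unsigned pos)
{
    uint32_t field = model_zero_extract(loc, size, pos);
    uint32_t sign = UINT32_C(1) << (size - 1);
    /* (field ^ sign) - sign propagates the field's sign bit upward.  */
    return (int32_t) ((field ^ sign) - sign);
}
```

For instance, extracting the 4-bit field at bit 4 of 0xF0 gives 15
with the zero-extending model and -1 with the sign-extending one.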


File: internals  Node: Conversions, Prev: Bit Fields, Up: RTL, Next: RTL Declarations

Conversions
===========

All conversions between machine modes must be represented by
explicit conversion operations.  For example, an expression
which is the sum of a byte and a full word cannot be written as
`(plus:SI (reg:QI 34) (reg:SI 80))' because the `plus'
operation requires two operands of the same machine mode.
Therefore, the byte-sized operand is enclosed in a conversion
operation, as in

     (plus:SI (sign_extend:SI (reg:QI 34)) (reg:SI 80))

The conversion operation is not a mere placeholder, because there
may be more than one way of converting from a given starting mode
to the desired final mode.  The conversion operation code says how
to do it.
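
To see why the choice of conversion code matters, here is a small C
sketch of the two ways to widen a byte to a word.  The function names
are hypothetical, and the cast through `int8_t' assumes the usual
two's-complement representation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical models, not GCC functions: widen a QImode (8-bit)
   value to SImode (32-bit) in the two possible ways.  */
int32_t model_sign_extend_qi_si(uint8_t x)
{
    return (int32_t) (int8_t) x;   /* copies bit 7 into bits 8..31 */
}

int32_t model_zero_extend_qi_si(uint8_t x)
{
    return (int32_t) x;            /* fills bits 8..31 with zeros */
}

/* The same byte yields different words under the two conversions.  */
void conversion_example(void)
{
    assert(model_sign_extend_qi_si(0xFF) == -1);
    assert(model_zero_extend_qi_si(0xFF) == 255);
}
```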

`(sign_extend:M X)'     
     Represents the result of sign-extending the value X
     to machine mode M.  M must be a fixed-point mode
     and X a fixed-point value of a mode narrower than M.
     
`(zero_extend:M X)'     
     Represents the result of zero-extending the value X
     to machine mode M.  M must be a fixed-point mode
     and X a fixed-point value of a mode narrower than M.
     
`(float_extend:M X)'     
     Represents the result of extending the value X
     to machine mode M.  M must be a floating point mode
     and X a floating point value of a mode narrower than M.
     
`(truncate:M X)'     
     Represents the result of truncating the value X
     to machine mode M.  M must be a fixed-point mode
     and X a fixed-point value of a mode wider than M.
     
`(float_truncate:M X)'     
     Represents the result of truncating the value X
     to machine mode M.  M must be a floating point mode
     and X a floating point value of a mode wider than M.
     
`(float:M X)'     
     Represents the result of converting fixed point value X
     to floating point mode M.
     
`(fix:M X)'     
     Represents the result of converting floating point value X
     to fixed point mode M.  How rounding is done is not specified.
     


File: internals  Node: RTL Declarations, Prev: Conversions, Up: RTL, Next: Side Effects

Declarations
============

Declaration expression codes do not represent arithmetic operations
but rather state assertions about their operands.

`(volatile:M X)'     
     Represents the same value X does, but makes the assertion
     that it should be treated as a volatile value.  This forbids
     coalescing multiple accesses or deleting them even if it would
     appear to have no effect on the program.  X must be a `mem'
     expression with mode M.
     
     The first thing the reload pass does to an insn is to remove all
     `volatile' expressions from it; each one is replaced by its
     operand.
     
     Recognizers will never recognize anything with `volatile' in it.
     This automatically prevents some optimizations on such things
     (such as instruction combination).  After the reload pass removes
     all volatility information, the insns can be recognized.
     
     Cse removes `volatile' from destinations of `set's, because
     no optimizations reorder such `set's.  This is not required for
     correct code and is done to permit some optimization on the value to
     be stored.
     
`(unchanging:M X)'     
     Represents the same value X does, but makes the assertion
     that its value is effectively constant during the execution
     of the current function.  This permits references to X
     to be moved freely within the function.  X must be a `reg'
     expression with mode M.
     
`(strict_low_part (subreg:M (reg:N R) 0))'     
     This expression code is used in only one context: operand 0 of a
     `set' expression.  In addition, the operand of this expression
     must be a `subreg' expression.
     
     The presence of `strict_low_part' says that the part of the
     register which is meaningful in mode N but is not part of
     mode M is not to be altered.  Normally, an assignment to such
     a subreg is allowed to have undefined effects on the rest of the
     register when M is less than a word.
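
The effect of `strict_low_part' on a store can be sketched in C.  The
sketch assumes that mode N is a 32-bit word, that mode M is a byte,
and that the `subreg' names the low-order 8 bits of the register; the
function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of
     (set (strict_low_part (subreg:QI (reg:SI R) 0)) X)
   Returns the new contents of register R.  */
uint32_t model_set_strict_low_qi(uint32_t reg, uint8_t x)
{
    /* Only the low byte changes; the upper 24 bits are preserved.  */
    return (reg & ~UINT32_C(0xFF)) | x;
}

void strict_low_part_example(void)
{
    assert(model_set_strict_low_qi(0x12345678, 0xAB) == 0x123456AB);
}
```

Without `strict_low_part', the upper 24 bits of the result would be
unspecified, so no such guarantee could be written down.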


File: internals  Node: Side Effects, Prev: RTL Declarations, Up: RTL, Next: Incdec

Side Effect Expressions
=======================

The expression codes described so far represent values, not actions.
But machine instructions never produce values; they are meaningful
only for their side effects on the state of the machine.  Special
expression codes are used to represent side effects.

The body of an instruction is always one of these side effect codes;
the codes described above, which represent values, appear only as
the operands of these.

`(set LVAL X)'     
     Represents the action of storing the value of X into the place
     represented by LVAL.  LVAL must be an expression
     representing a place that can be stored in: `reg' (or
     `subreg' or `strict_low_part'), `mem', `pc' or
     `cc0'.
     
     If LVAL is a `reg', `subreg' or `mem', it has a
     machine mode; then X must be valid for that mode.
     
     If LVAL is a `reg' whose machine mode is less than the full
     width of the register, then it means that the part of the register
     specified by the machine mode is given the specified value and the
     rest of the register receives an undefined value.  Likewise, if
     LVAL is a `subreg' whose machine mode is narrower than
     `SImode', the rest of the register can be changed in an undefined way.
     
     If LVAL is a `strict_low_part' of a `subreg', then the
     part of the register specified by the machine mode of the
     `subreg' is given the value X and the rest of the register
     is not changed.
     
     If LVAL is `(cc0)', it has no machine mode, and X may
     have any mode.  This represents a "test" or "compare" instruction.
     
     If LVAL is `(pc)', we have a jump instruction, and the
     possibilities for X are very limited.  It may be a
     `label_ref' expression (unconditional jump).  It may be an
     `if_then_else' (conditional jump), in which case either the
     second or the third operand must be `(pc)' (for the case which
     does not jump) and the other of the two must be a `label_ref'
     (for the case which does jump).  X may also be a `mem' or
     `(plus:SI (pc) Y)', where Y may be a `reg' or a
     `mem'; these unusual patterns are used to represent jumps through
     branch tables.
     
`(return)'     
     Represents a return from the current function, on machines where
     this can be done with one instruction, such as Vaxen.  On machines
     where a multi-instruction "epilogue" must be executed in order
     to return from the function, returning is done by jumping to a
     label which precedes the epilogue, and the `return' expression
     code is never used.
     
`(call FUNCTION NARGS)'     
     Represents a function call.  FUNCTION is a `mem' expression
     whose address is the address of the function to be called.  NARGS
     is an expression representing the number of words of argument.
     
     Each machine has a standard machine mode which FUNCTION must
     have.  The machine description defines the macro `FUNCTION_MODE' to
     expand into the requisite mode name.  The purpose of this mode is to
     specify what kind of addressing is allowed, on machines where the
     allowed kinds of addressing depend on the machine mode being
     addressed.
     
`(clobber X)'     
     Represents the storing or possible storing of an unpredictable,
     undescribed value into X, which must be a `reg' or
     `mem' expression.
     
     One place this is used is in string instructions that store standard
     values into particular hard registers.  It may not be worth the
     trouble to describe the values that are stored, but it is essential
     to inform the compiler that the registers will be altered, lest it
     attempt to keep data in them across the string instruction.
     
     X may also be null---a null C pointer, no expression at all.
     Such a `(clobber (null))' expression means that all memory
     locations must be presumed clobbered.
     
     Note that the machine description classifies certain hard registers as
     "call-clobbered".  All function call instructions are assumed by
     default to clobber these registers, so there is no need to use
     `clobber' expressions to indicate this fact.  Also, each function
     call is assumed to have the potential to alter any memory location.
     
`(use X)'     
     Represents the use of the value of X.  It indicates that
     the value in X at this point in the program is needed,
     even though it may not be apparent why this is so.  Therefore, the
     compiler will not attempt to delete instructions whose only
     effect is to store a value in X.  X must be a `reg'
     expression.
     
`(parallel [X0 X1 ...])'     
     Represents several side effects performed in parallel.  The square
     brackets stand for a vector; the operand of `parallel' is a
     vector of expressions.  X0, X1 and so on are individual
     side effects---expressions of code `set', `call',
     `return', `clobber' or `use'.
     
     "In parallel" means that first all the values used in
     the individual side-effects are computed, and second all the actual
     side-effects are performed.  For example,
     
          (parallel [(set (reg:SI 1) (mem:SI (reg:SI 1)))
                     (set (mem:SI (reg:SI 1)) (reg:SI 1))])
     
     says unambiguously that the values of hard register 1 and the memory
     location addressed by it are interchanged.  In both places where
     `(reg:SI 1)' appears as a memory address it refers to the value
     in register 1 before the execution of the instruction.
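
The two-phase meaning of `parallel' can be sketched in C.  The machine
state below is hypothetical (one register and a four-word memory
indexed by word rather than by byte); the point is only that every
value and address is read from the old state before any store happens.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical machine state, for illustration only.  */
struct machine {
    uint32_t reg1;
    uint32_t mem[4];
};

/* Model of
     (parallel [(set (reg:SI 1) (mem:SI (reg:SI 1)))
                (set (mem:SI (reg:SI 1)) (reg:SI 1))])  */
void model_parallel_swap(struct machine *m)
{
    assert(m->reg1 < 4);

    /* Phase 1: compute every value and address from the old state.  */
    uint32_t old_reg = m->reg1;
    uint32_t addr = old_reg;
    uint32_t old_mem = m->mem[addr];

    /* Phase 2: perform all the stores.  */
    m->reg1 = old_mem;
    m->mem[addr] = old_reg;
}
```

Executing the two `set's one after the other instead would store the
new register value into the wrong memory location.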

Three expression codes appear in place of a side effect, as the body
of an insn, though strictly speaking they do not describe side effects
as such:

`(asm_input S)'     
     Represents literal assembler code as described by the string S.
     
`(addr_vec:M [LR0 LR1 ...])'     
     Represents a table of jump addresses.  LR0 etc. are
     `label_ref' expressions.  The mode M specifies how much
     space is given to each address; normally M would be
     `Pmode'.
     
`(addr_diff_vec:M BASE [LR0 LR1 ...])'     
     Represents a table of jump addresses expressed as offsets from
     BASE.  LR0 etc. are `label_ref' expressions and so is
     BASE.  The mode M specifies how much space is given to
     each address-difference.


File: internals  Node: Incdec, Prev: Side Effects, Up: RTL, Next: Insns

Embedded Side-Effects on Addresses
==================================

Four special side-effect expression codes appear as memory addresses.

`(pre_dec:M X)'     
     Represents the side effect of decrementing X by a standard
     amount and represents also the value that X has after being
     decremented.  X must be a `reg' or `mem', but most
     machines allow only a `reg'.  M must be the machine mode
     for pointers on the machine in use.  The amount by which X is
     decremented is the length in bytes of the machine mode of the
     containing memory reference of which this expression serves as the
     address.  Here is an example of its use:
     
          (mem:DF (pre_dec:SI (reg:SI 39)))
     
     This says to decrement pseudo register 39 by the length of a `DFmode'
     value and use the result to address a `DFmode' value.
     
`(pre_inc:M X)'     
     Similar, but specifies incrementing X instead of decrementing it.
     
`(post_dec:M X)'     
     Represents the same side effect as `pre_dec' but a different
     value.  The value represented here is the value X has before
     being decremented.
     
`(post_inc:M X)'     
     Similar, but specifies incrementing X instead of decrementing it.
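
The four codes differ only in the direction of the change and in
whether the address used is the old or the new register value.  A
minimal C sketch, with hypothetical function names, where `reg' holds
a byte address and `size' is the length in bytes of the mode of the
containing `mem':

```c
#include <assert.h>
#include <stddef.h>

/* Each function updates *REG and returns the address actually used.  */
size_t model_pre_dec(size_t *reg, size_t size)
{
    *reg -= size;
    return *reg;            /* value after the decrement */
}

size_t model_pre_inc(size_t *reg, size_t size)
{
    *reg += size;
    return *reg;            /* value after the increment */
}

size_t model_post_dec(size_t *reg, size_t size)
{
    size_t old = *reg;
    *reg -= size;
    return old;             /* value before the decrement */
}

size_t model_post_inc(size_t *reg, size_t size)
{
    size_t old = *reg;
    *reg += size;
    return old;             /* value before the increment */
}

/* Model of (mem:DF (pre_dec:SI (reg:SI 39))), assuming that a
   `DFmode' value occupies 8 bytes.  */
void pre_dec_example(void)
{
    size_t reg39 = 100;
    assert(model_pre_dec(&reg39, 8) == 92);
    assert(reg39 == 92);
}
```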

These embedded side effect expressions must be used with care.  Instruction
patterns may not use them.  Until the `flow' pass of the compiler,
they may occur only to represent pushes onto the stack.  The `flow'
pass finds cases where registers are incremented or decremented in one
instruction and used as an address shortly before or after; these cases are
then transformed to use pre- or post-increment or -decrement.

Explicit popping of the stack could be represented with these embedded
side effect operators, but that would not be safe; the instruction
combination pass could move the popping past pushes, thus changing
the meaning of the code.

An instruction that can be represented with an embedded side effect
could also be represented using `parallel' containing an additional
`set' to describe how the address register is altered.  This is not
done because machines that allow these operations at all typically
allow them wherever a memory address is called for.  Describing them as
additional parallel stores would require doubling the number of entries
in the machine description.


File: internals  Node: Insns, Prev: Incdec, Up: RTL, Next: Sharing

Insns
=====

The RTL representation of the code for a function is a doubly-linked
chain of objects called "insns".  Insns are expressions with
special codes that are used for no other purpose.  Some insns are
actual instructions; others represent dispatch tables for `switch'
statements; others represent labels to jump to or various sorts of
declaratory information.

In addition to its own specific data, each insn must have a unique id number
that distinguishes it from all other insns in the current function, and
chain pointers to the preceding and following insns.  These three fields
occupy the same position in every insn, independent of the expression code
of the insn.  They could be accessed with `XEXP' and `XINT',
but instead three special macros are always used:

`INSN_UID (I)'     
     Accesses the unique id of insn I.
     
`PREV_INSN (I)'     
     Accesses the chain pointer to the insn preceding I.
     If I is the first insn, this is a null pointer.
     
`NEXT_INSN (I)'     
     Accesses the chain pointer to the insn following I.
     If I is the last insn, this is a null pointer.

The `NEXT_INSN' and `PREV_INSN' pointers must always
correspond: if INSN is not the first insn,

     NEXT_INSN (PREV_INSN (INSN)) == INSN

is always true.
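
The chain invariant can be illustrated with a toy doubly-linked list
in C.  The structure and function names below are hypothetical
stand-ins; real insns are RTL expressions whose fields are reached
through `INSN_UID', `PREV_INSN' and `NEXT_INSN'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn.  */
struct toy_insn {
    int uid;
    struct toy_insn *prev;
    struct toy_insn *next;
};

/* Link INSN into the chain immediately after AFTER, keeping the
   pointers of both neighbors in step.  */
void toy_insert_after(struct toy_insn *after, struct toy_insn *insn)
{
    assert(after != NULL && insn != NULL);
    insn->prev = after;
    insn->next = after->next;
    if (after->next != NULL)
        after->next->prev = insn;
    after->next = insn;
}

/* Return 1 if the chain starting at FIRST is consistently linked.  */
int toy_chain_ok(const struct toy_insn *first)
{
    if (first == NULL)
        return 1;
    if (first->prev != NULL)
        return 0;                 /* first insn has no predecessor */
    for (const struct toy_insn *i = first; i != NULL; i = i->next) {
        if (i->prev != NULL && i->prev->next != i)
            return 0;             /* next of prev is not this insn */
        if (i->next != NULL && i->next->prev != i)
            return 0;             /* prev of next is not this insn */
    }
    return 1;
}
```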

Every insn has one of the following six expression codes:

`insn'     
     The expression code `insn' is used for instructions that do not jump
     and do not do function calls.  Insns with code `insn' have four
     additional fields beyond the three mandatory ones listed above.
     These four are described in a table below.
     
`jump_insn'     
     The expression code `jump_insn' is used for instructions that may jump
     (or, more generally, may contain `label_ref' expressions).
     `jump_insn' insns have the same extra fields as `insn' insns,
     accessed in the same way.
     
`call_insn'     
     The expression code `call_insn' is used for instructions that may do
     function calls.  It is important to distinguish these instructions because
     they imply that certain registers and memory locations may be altered
     unpredictably.
     
     `call_insn' insns have the same extra fields as `insn' insns,
     accessed in the same way.
     
`code_label'     
     A `code_label' insn represents a label that a jump insn can jump to.
     It contains one special field of data in addition to the three standard ones.
     It is used to hold the "label number", a number that identifies this
     label uniquely among all the labels in the compilation (not just in the
     current function).  Ultimately, the label is represented in the assembler
     output as an assembler label `LN' where N is the label number.
     
`barrier'     
     Barriers are placed in the instruction stream after unconditional
     jump instructions to indicate that the jumps are unconditional.
     They contain no information beyond the three standard fields.
     
`note'     
     `note' insns are used to represent additional debugging and
     declaratory information.  They contain two nonstandard fields, an
     integer which is accessed with the macro `NOTE_LINE_NUMBER' and a
     string accessed with `NOTE_SOURCE_FILE'.
     
     If `NOTE_LINE_NUMBER' is positive, the note represents the
     position of a source line and `NOTE_SOURCE_FILE' is the source file name
     that the line came from.  These notes control generation of line
     number data in the assembler output.
     
     Otherwise, `NOTE_LINE_NUMBER' is not really a line number but a
     code with one of the following values (and `NOTE_SOURCE_FILE'
     must contain a null pointer):
     
     `NOTE_INSN_DELETED'     
          Such a note is completely ignorable.  Some passes of the compiler
          delete insns by altering them into notes of this kind.
          
     `NOTE_INSN_BLOCK_BEG'     
     `NOTE_INSN_BLOCK_END'     
          These types of notes indicate the position of the beginning and end
          of a level of scoping of variable names.  They control the output
          of debugging information.
          
     `NOTE_INSN_LOOP_BEG'     
     `NOTE_INSN_LOOP_END'     
          These types of notes indicate the position of the beginning and end
          of a `while' or `for' loop.  They enable the loop optimizer
          to find loops quickly.

Here is a table of the extra fields of `insn', `jump_insn'
and `call_insn' insns:

`PATTERN (I)'     
     An expression for the side effect performed by this insn.
     
`REG_NOTES (I)'     
     A list (chain of `expr_list' expressions) giving information
     about the usage of registers in this insn.  This list is set up by the
     `flow' pass; it is a null pointer until then.
     
`LOG_LINKS (I)'     
     A list (chain of `insn_list' expressions) of previous "related"
     insns: insns which store into registers values that are used for the
     first time in this insn.  (An additional constraint is that neither a
     jump nor a label may come between the related insns).  This list is
     set up by the `flow' pass; it is a null pointer until then.
     
`INSN_CODE (I)'     
     An integer that says which pattern in the machine description matches
     this insn, or -1 if the matching has not yet been attempted.
     
     Such matching is never attempted and this field is not used on an insn
     whose pattern consists of a single `use', `clobber',
     `asm_input', `addr_vec' or `addr_diff_vec' expression.

The `LOG_LINKS' field of an insn is a chain of `insn_list'
expressions.  Each of these has two operands: the first is an insn,
and the second is another `insn_list' expression (the next one in
the chain).  The last `insn_list' in the chain has a null pointer
as second operand.  The significant thing about the chain is which
insns appear in it (as first operands of `insn_list'
expressions).  Their order is not significant.

The `REG_NOTES' field of an insn is a similar chain but of
`expr_list' expressions instead of `insn_list'.  The first
operand is a `reg' rtx.  Its presence in the list can have four
possible meanings, distinguished by a value that is stored in the
machine-mode field of the `expr_list' because that is a
conveniently available space, but that is not really a machine mode.
These values belong to the C type `enum reg_note' and there are
four of them:

`REG_DEAD'     
     The `reg' listed dies in this insn; that is to say, altering
     the value immediately after this insn would not affect the future
     behavior of the program.
     
`REG_INC'     
     The `reg' listed is incremented (or decremented; at this level
     there is no distinction) by an embedded side effect inside this insn.
     
`REG_CONST'     
     The `reg' listed has a value that could safely be replaced
     everywhere by the value that this insn copies into it.  ("Safety"
     here refers to the data flow of the program; such replacement may
     require reloading into registers for some of the insns in which
     the `reg' is replaced.)
     
`REG_WAS_0'     
     The `reg' listed contained zero before this insn.  You can rely
     on this note if it is present; its absence implies nothing.

(The only difference between the expression codes `insn_list' and
`expr_list' is that the first operand of an `insn_list' is
assumed to be an insn and is printed in debugging dumps as the insn's
unique id; the first operand of an `expr_list' is printed in the
ordinary way as an expression.)
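
Searching such a chain is a simple list walk.  The C sketch below uses
hypothetical stand-in types rather than real `expr_list' expressions:
the note kind is an ordinary field here, whereas the compiler keeps it
in the machine-mode field, and the register is identified by number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for `enum reg_note' and `expr_list'.  */
enum toy_reg_note {
    TOY_REG_DEAD, TOY_REG_INC, TOY_REG_CONST, TOY_REG_WAS_0
};

struct toy_note {
    enum toy_reg_note kind;   /* what the note asserts */
    int regno;                /* first operand: the register */
    struct toy_note *next;    /* second operand: rest of chain or NULL */
};

/* Return 1 if NOTES records a note of KIND for register REGNO.
   The order of the chain does not matter.  */
int toy_has_note(const struct toy_note *notes,
                 enum toy_reg_note kind, int regno)
{
    for (; notes != NULL; notes = notes->next)
        if (notes->kind == kind && notes->regno == regno)
            return 1;
    return 0;
}

void toy_note_example(void)
{
    struct toy_note n2 = { TOY_REG_DEAD, 7, NULL };
    struct toy_note n1 = { TOY_REG_INC, 3, &n2 };
    assert(toy_has_note(&n1, TOY_REG_DEAD, 7));
    assert(!toy_has_note(&n1, TOY_REG_DEAD, 3));
}
```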


File: internals  Node: Sharing, Prev: Insns, Up: RTL

Structure Sharing Assumptions
=============================

The compiler assumes that certain kinds of RTL expressions are unique;
there do not exist two distinct objects representing the same value.
In other cases, it makes an opposite assumption: that no RTL expression
object of a certain kind appears in more than one place in the
containing structure.

These assumptions refer to a single function; except for the RTL
objects that describe global variables and external functions,
no RTL objects are common to two functions.

   * Each pseudo-register has only a single `reg' object to represent it,
     and therefore only a single machine mode.
     
   * For any symbolic label, there is only one `symbol_ref' object
     referring to it.
     
   * There is only one `const_int' expression with value zero,
     and only one with value one.
     
   * There is only one `pc' expression.
     
   * There is only one `cc0' expression.
     
   * There is only one `const_double' expression with mode
     `SFmode' and value zero, and only one with mode `DFmode' and
     value zero.
     
   * No `label_ref' appears in more than one place in the RTL structure;
     in other words, it is safe to do a tree-walk of all the insns in the function
     and assume that each time a `label_ref' is seen it is distinct from all
     other `label_ref's seen.
     
   * Aside from the cases listed above, the only kind of expression
     object that may appear in more than one place is the `mem'
     object that describes a stack slot or a static variable.


File: internals  Node: Machine Desc, Prev: RTL, Up: Top, Next: Machine Macros

Machine Descriptions
********************

A machine description has two parts: a file of instruction patterns
(`.md' file) and a C header file of macro definitions.

The `.md' file for a target machine contains a pattern for each
instruction that the target machine supports (or at least each instruction
that is worth telling the compiler about).  It may also contain comments.
A semicolon causes the rest of the line to be a comment, unless the semicolon
is inside a quoted string.

See the next chapter for information on the C header file.

* Menu:

* Patterns::            How to write instruction patterns.
* Example::             Example of an instruction pattern.
* Constraints::         When not all operands are general operands.
* Standard Names::      Names mark patterns to use for code generation.
* Dependent Patterns::  Having one pattern may make you need another.


File: internals  Node: Patterns, Prev: Machine Desc, Up: Machine Desc, Next: Example

Instruction Patterns
====================

Each instruction pattern contains an incomplete RTL expression, with pieces
to be filled in later, operand constraints that restrict how the pieces can
be filled in, and an output pattern or C code to generate the assembler
output, all wrapped up in a `define_insn' expression.

Sometimes an insn can match more than one instruction pattern.  Then the
pattern that appears first in the machine description is the one used.
Therefore, more specific patterns should usually go first in the
description.

The `define_insn' expression contains four operands:

  1. An optional name.  The presence of a name indicates that this instruction
     pattern can perform a certain standard job for the RTL-generation
     pass of the compiler.  This pass knows certain names and will use
     the instruction patterns with those names, if the names are defined
     in the machine description.
     
     The absence of a name is indicated by writing an empty string
     where the name should go.  Nameless instruction patterns are never
     used for generating RTL code, but they may permit several simpler insns
     to be combined later on.
     
     Names that are not thus known and used in RTL-generation have no
     effect; they are equivalent to no name at all.
     
  2. The recognition template.  This is a vector of incomplete RTL
     expressions which show what the instruction should look like.  It is
     incomplete because it may contain `match_operand' and
     `match_dup' expressions that stand for operands of the
     instruction.
     
     If the vector has only one element, that element is what the
     instruction should look like.  If the vector has multiple elements,
     then the instruction looks like a `parallel' expression
     containing that many elements as described.
     
  3. A condition.  This is a string which contains a C expression that is
     the final test to decide whether an insn body matches this pattern.
     
     For a named pattern, the condition (if present) may not depend on
     the data in the insn being matched, but only the target-machine-type
     flags.  The compiler needs to test these conditions during
     initialization in order to learn exactly which named instructions are
     available in a particular run.
     
     For nameless patterns, the condition is applied only when matching an
     individual insn, and only after the insn has matched the pattern's
     recognition template.  The insn's operands may be found in the vector
     `operands'.
     
  4. A string that says how to output matching insns as assembler code.  In
     the simpler case, the string is an output template, much like a
     `printf' control string.  `%' in the string specifies where
     to insert the operands of the instruction; the `%' is followed by
     a single-digit operand number.
     
     `%cDIGIT' can be used to substitute an operand that is a
     constant value without the syntax that normally indicates an immediate
     operand.
     
     `%aDIGIT' can be used to substitute an operand as if it
     were a memory reference, with the actual operand treated as the address.
     This may be useful when outputting a "load address" instruction,
     because often the assembler syntax for such an instruction requires
     you to write the operand as if it were a memory reference.
     
     The template may generate multiple assembler instructions.
     Write the text for the instructions, with `\;' between them.
     
     If the output control string starts with a `*', then it is not an
     output template but rather a piece of C program that should compute a
     template.  It should execute a `return' statement to return the
     template-string you want.  Most such templates use C string literals,
     which require doublequote characters to delimit them.  To include
     these doublequote characters in the string, prefix each one with
     `\'.
     
     The operands may be found in the array `operands', whose C
     data type is `rtx []'.
     
     It is possible to output an assembler instruction and then go on to
     output or compute more of them, using the subroutine
     `output_asm_insn'.  This receives two arguments: a
     template-string and a vector of operands.  The vector may be
     `operands', or it may be another array of `rtx' that you
     declare locally and initialize yourself.
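
The `%DIGIT' substitution described above can be pictured with a small
stand-alone C sketch.  It is not the compiler's own code: the function
name `expand_template' is hypothetical, operands are modeled as plain
strings rather than `rtx' values, and `%c', `%a' and `\;' are
deliberately left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the compiler: expand the output
   template TMPL into OUT, replacing each `%' that is followed by a
   digit with the corresponding string from OPERANDS.  Returns the
   length written, or (size_t) -1 if OUT is too small or an operand
   number is out of range.  */
static size_t
expand_template (const char *tmpl, const char *const operands[],
                 size_t n_operands, char *out, size_t out_size)
{
  size_t len = 0;

  assert (out_size > 0);
  for (const char *p = tmpl; *p != '\0'; p++)
    {
      const char *piece = p;    /* by default, copy one character */
      size_t piece_len = 1;

      if (*p == '%' && isdigit ((unsigned char) p[1]))
        {
          size_t n = (size_t) (p[1] - '0');

          if (n >= n_operands)
            return (size_t) -1;
          piece = operands[n];
          piece_len = strlen (piece);
          p++;                  /* skip the digit as well */
        }
      if (len + piece_len >= out_size)
        return (size_t) -1;     /* no room for text plus terminator */
      memcpy (out + len, piece, piece_len);
      len += piece_len;
    }
  out[len] = '\0';
  return len;
}
```

Given operand strings `d0' and `d1', the template `addl %1,%0' would
expand to `addl d1,d0' under this sketch.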

For named patterns, the recognition template is also used for
constructing insns.  Construction involves substituting specified
operands into a copy of the template.  Matching involves determining
the values that serve as the operands in the insn being matched.  Both
of these activities are controlled by two special expression types
that direct matching and substitution of the operands.

`(match_operand:M N TESTFN CONSTRAINT)'     
     This expression is a placeholder for operand number N of
     the insn.  When constructing an insn, operand number N
     will be substituted at this point.  When matching an insn, whatever
     appears at this position in the insn will be taken as operand
     number N; but it must satisfy TESTFN or this instruction
     pattern will not match at all.
     
     Operand numbers must be chosen consecutively counting from zero in
     each instruction pattern.  There may be only one `match_operand'
     expression in the pattern for each operand number, and they must
     appear in order of increasing operand number.
     
     TESTFN is a string that is the name of a C function that accepts
     two arguments, a machine mode and an expression.  During matching,
     the function will be called with M as the mode argument
     and the putative operand as the other argument.  If it returns zero,
     this instruction pattern fails to match.  TESTFN may be
     an empty string; then it means no test is to be done on the operand.
     
     Most often, TESTFN is `"general_operand"'.  It checks
     that the putative operand is either a constant, a register or a
     memory reference, and that it is valid for mode M.
     
     CONSTRAINT is explained later.
     
`(match_dup N)'     
     This expression is also a placeholder for operand number N.
     It is used when the operand needs to appear more than once in the
     insn.
     
     In construction, `match_dup' behaves exactly like
     `match_operand': the operand is substituted into the insn being
     constructed.  But in matching, `match_dup' behaves differently.
     It assumes that operand number N has already been determined by
     a `match_operand' appearing earlier in the recognition template,
     and it matches only an identical-looking expression.
     
`(address (match_operand:M N "address_operand" ""))'     
     This complex of expressions is a placeholder for an operand number
     N in a "load address" instruction: an operand which specifies
     a memory location in the usual way, but for which the actual operand
     value used is the address of the location, not the contents of the
     location.
     
     `address' expressions never appear in RTL code, only in machine
     descriptions.  And they are used only in machine descriptions that do
     not use the operand constraint feature.  When operand constraints are
     in use, the letter `p' in the constraint serves this purpose.
     
     M is the machine mode of the *memory location being
     addressed*, not the machine mode of the address itself.  The mode of
     the address is always the same on a given target machine (it is
     `Pmode', which normally is `SImode'), so there is no point in
     mentioning it; thus, no machine mode is written in the `address'
     expression.  If
     some day support is added for machines in which addresses of different
     kinds of objects appear differently or are used differently (such as
     the PDP-10), different formats would perhaps need different machine
     modes and these modes might be written in the `address'
     expression.
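
The difference between `match_operand' and `match_dup' during matching
can be sketched in C as follows.  This is only an illustration under
simplifying assumptions: `struct toy_expr', `toy_equal' and `toy_match'
are hypothetical names, the tree type is far simpler than real RTL,
and the TESTFN and CONSTRAINT arguments are not modeled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression tree; NOT the compiler's real `rtx' type.  */
enum toy_code
{ TOY_REG, TOY_PLUS, TOY_SET, TOY_MATCH_OPERAND, TOY_MATCH_DUP };

struct toy_expr
{
  enum toy_code code;
  int value;                    /* register number, or operand number N */
  const struct toy_expr *left, *right;
};

/* Nonzero if A and B are identical-looking expressions.  */
static int
toy_equal (const struct toy_expr *a, const struct toy_expr *b)
{
  if (a == NULL || b == NULL)
    return a == b;
  return (a->code == b->code && a->value == b->value
          && toy_equal (a->left, b->left)
          && toy_equal (a->right, b->right));
}

/* Match expression X against pattern PAT, recording operands in
   OPERANDS (which the caller must clear first).  Returns nonzero
   on a match.  */
static int
toy_match (const struct toy_expr *pat, const struct toy_expr *x,
           const struct toy_expr **operands)
{
  if (pat == NULL || x == NULL)
    return pat == NULL && x == NULL;
  switch (pat->code)
    {
    case TOY_MATCH_OPERAND:
      /* Whatever appears here becomes operand N.  A real matcher
         would call TESTFN at this point.  */
      operands[pat->value] = x;
      return 1;
    case TOY_MATCH_DUP:
      /* Operand N must already be known, and X must look the same.  */
      return (operands[pat->value] != NULL
              && toy_equal (operands[pat->value], x));
    default:
      return (pat->code == x->code && pat->value == x->value
              && toy_match (pat->left, x->left, operands)
              && toy_match (pat->right, x->right, operands));
    }
}
```

In this model a pattern of the form (set OP0 (plus DUP0 OP1)) matches
an expression only when the destination and the first addend are
identical-looking.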


File: internals  Node: Example, Prev: Patterns, Up: Machine Desc, Next: Constraints

Example of `define_insn'
========================

Here is an actual example of an instruction pattern, for the 68000/68020.

     (define_insn "tstsi"
       [(set (cc0)
     	(match_operand:SI 0 "general_operand" "rm"))]
       ""
       "*
     { if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
         return \"tstl %0\";
       return \"cmpl #0,%0\"; }")

This is an instruction that sets the condition codes based on the value of
a general operand.  It has no condition, so any insn whose RTL description
has the form shown may be handled according to this pattern.  The name
`tstsi' means "test a `SImode' value" and tells the RTL generation
pass that, when it is necessary to test such a value, an insn to do so
can be constructed using this pattern.

The output control string is a piece of C code which chooses which
output template to return based on the kind of operand and the specific
type of CPU for which code is being generated.

`"rm"' is an operand constraint.  Its meaning is explained below.


File: internals  Node: Constraints, Prev: Example, Up: Machine Desc, Next: Standard Names

Operand Constraints
===================

Each `match_operand' in an instruction pattern can specify a
constraint for the type of operands allowed.  Constraints can say whether
an operand may be in a register, and which kinds of register; whether the
operand can be a memory reference, and which kinds of address; whether the
operand may be an immediate constant, and which possible values it may
have.  Constraints can also require two operands to match.

* Menu:

* Simple Constraints::  Basic use of constraints.
* Multi-Alternative::   When an insn has two alternative constraint-patterns.
* Class Preferences::   Constraints guide which hard register to put things in.
* Modifiers::           More precise control over effects of constraints.
* No Constraints::      Describing a clean machine without constraints.


File: internals  Node: Simple Constraints, Prev: Constraints, Up: Constraints, Next: Multi-Alternative

Simple Constraints
------------------

The simplest kind of constraint is a string full of letters, each of
which describes one kind of operand that is permitted.  Here are
the letters that are allowed:

`m'     
     A memory operand is allowed, with any kind of address that the machine
     supports in general.
     
`o'     
     A memory operand is allowed, but only if the address is "offsetable".
     This means that a small integer (actually, the width in bytes of the
     operand, as determined by its machine mode) may be added to the address
     and the result is also a valid memory address.  For example, an address
     which is constant is offsetable; so is an address that is the sum of
     a register and a constant (as long as a slightly larger constant is also
     within the range of address-offsets supported by the machine); but an
     autoincrement or autodecrement address is not offsetable.  More complicated
     indirect/indexed addresses may or may not be offsetable depending on the
     other addressing modes that the machine supports.
     
`<'     
     A memory operand with autodecrement addressing (either predecrement or
     postdecrement) is allowed.
     
`>'     
     A memory operand with autoincrement addressing (either preincrement or
     postincrement) is allowed.
     
`r'     
     A register operand is allowed provided that it is in a general register.
     
`d'     
`a'     
`f'     
`...'     
     Other letters can be defined in machine-dependent fashion to stand for
     particular classes of registers.  `d', `a' and `f' are
     defined on the 68000/68020 to stand for data, address and floating point
     registers.
     
`i'     
     An immediate integer operand (one with constant value) is allowed.
     
`I'     
`J'     
`K'     
`...'     
     Other letters in the range `I' through `M' may be defined in a
     machine-dependent fashion to permit immediate integer operands with
     explicit integer values in specified ranges.  For example, on the 68000,
     `I' is defined to stand for the range of values 1 to 8.  This is the
     range permitted as a shift count in the shift instructions.
     
`F'     
     An immediate floating operand (expression code `const_double') is
     allowed.
     
`G'     
`H'     
     `G' and `H' may be defined in a machine-dependent fashion to
     permit immediate floating operands in particular ranges of values.
     
`s'     
     An immediate integer operand whose value is not an explicit integer is
     allowed.  This might appear strange; if an insn allows a constant operand
     with a value not known at compile time, it certainly must allow any known
     value.  So why use `s' instead of `i'?  Sometimes it allows
     better code to be generated.  For example, on the 68000 in a fullword
     instruction it is possible to use an immediate operand; but if the
     immediate value is between -32 and 31, better code results from loading the
     value into a register and using the register.  This is because the load
     into the register can be done with a `moveq' instruction.  We arrange
     for this to happen by defining the letter `K' to mean "any integer
     outside the range -32 to 31", and then specifying `Ks' in the operand
     constraints.
     
`g'     
     Any register, memory or immediate integer operand is allowed, except for
     registers that are not general registers.
     
`N, a digit'     
     An operand identical to operand number N is allowed.
     If a digit is used together with letters, the digit should come last.
     
`p'     
     An operand that is a valid memory address is allowed.  This is
     for "load address" and "push address" instructions.
     
     If `p' is used in the constraint, the test-function in the
     `match_operand' must be `address_operand'.

In order to have valid assembler code, each operand must satisfy
its constraint.  But a failure to do so does not prevent the pattern
from applying to an insn.  Instead, it directs the compiler to modify
the code such that the constraint will be satisfied.  Usually this is
done by copying an operand into a register.

Contrast, therefore, the two instruction patterns that follow:

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r")
             (plus:SI (match_dup 0)
                      (match_operand:SI 1 "general_operand" "r")))]
       ""
       "...")

which has two operands, one of which must appear in two places, and

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "r")
             (plus:SI (match_operand:SI 1 "general_operand" "0")
                      (match_operand:SI 2 "general_operand" "r")))]
       ""
       "...")

which has three operands, two of which are required by a constraint to be
identical.  If we are considering an insn of the form

     (insn N PREV NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 6) (reg:SI 109)))
       ...)

the first pattern would not apply at all, because this insn does not
contain two identical subexpressions in the right place.  The pattern would
say, "That does not look like an add instruction; try other patterns."
The second pattern would say, "Yes, that's an add instruction, but there
is something wrong with it."  It would direct the reload pass of the
compiler to generate additional insns to make the constraint true.  The
results might look like this:

     (insn N2 PREV N
       (set (reg:SI 3) (reg:SI 6))
       ...)
     
     (insn N N2 NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 3) (reg:SI 109)))
       ...)
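
The fix-up shown above can be sketched in C.  This is a toy model
only, not the reload pass itself: `struct toy_insn' and
`fix_matching_operand' are hypothetical names, registers are plain
integers, and the case where the destination is also the second
addend is excluded by an assertion rather than handled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy insn forms; hypothetical, far simpler than real RTL.  */
enum toy_kind { TOY_COPY, TOY_ADD };

struct toy_insn
{
  enum toy_kind kind;
  int dest;                     /* register set by the insn */
  int src1;                     /* register copied, or first addend */
  int src2;                     /* second addend; -1 for TOY_COPY */
};

/* Make the first addend of ADD identical to its destination, as a
   "0" constraint on that operand requires.  Writes one or two insns
   to OUT and returns how many were written.  */
static size_t
fix_matching_operand (struct toy_insn add, struct toy_insn out[2])
{
  size_t n = 0;

  assert (add.kind == TOY_ADD);
  /* Copying into DEST first would clobber SRC2 if they were the same
     register; this sketch does not handle that case.  */
  assert (add.src1 == add.dest || add.src2 != add.dest);

  if (add.src1 != add.dest)
    {
      /* Emit (set (reg DEST) (reg SRC1)) ahead of the add.  */
      out[n].kind = TOY_COPY;
      out[n].dest = add.dest;
      out[n].src1 = add.src1;
      out[n].src2 = -1;
      n++;
      add.src1 = add.dest;
    }
  out[n++] = add;
  return n;
}
```

Applied to an add of registers 6 and 109 into register 3, the sketch
yields a copy of register 6 into register 3 followed by an add of
registers 3 and 109 into register 3.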

Because insns that don't fit the constraints are fixed up by loading
operands into registers, every instruction pattern's constraints must
permit the case where all the operands are in registers.  It need not
permit all classes of registers; the compiler knows how to copy registers
into other registers of the proper class in order to make an instruction
valid.  But if no registers are permitted, the compiler will be stymied: it
does not know how to save a register in memory in order to make an
instruction valid.  Instruction patterns that reject registers can be
made valid by attaching a condition-expression that refuses to match
an insn at all if the crucial operand is a register.


File: internals  Node: Multi-Alternative, Prev: Simple Constraints, Up: Constraints, Next: Class Preferences

Multiple Alternative Constraints
--------------------------------

Sometimes a single instruction has multiple alternative sets of possible
operands.  For example, on the 68000, a logical-or instruction can combine
a register or an immediate value into memory, or it can combine any kind of
operand into a register; but it cannot combine one memory location into
another.

These constraints are represented as multiple alternatives.  An alternative
can be described by a series of letters for each operand.  The overall
constraint for an operand is made from the letters for this operand
from the first alternative, a comma, the letters for this operand from
the second alternative, a comma, and so on until the last alternative.
Here is how it is done for fullword logical-or on the 68000:

     (define_insn "iorsi3"
       [(set (match_operand:SI 0 "general_operand" "=%m,d")
     	(ior:SI (match_operand:SI 1 "general_operand" "0,0")
     		(match_operand:SI 2 "general_operand" "dKs,dmKs")))]
       ...)

The first alternative has `m' (memory) for operand 0, `0' for
operand 1 (meaning it must match operand 0), and `dKs' for operand 2.
The second alternative has `d' (data register) for operand 0, `0'
for operand 1, and `dmKs' for operand 2.  The `=' and `%' in
the constraint for operand 0 are not part of any alternative; their meaning
is explained below, under "Constraint Modifier Characters".

If all the operands fit any one alternative, the instruction is valid.
Otherwise, for each alternative, the compiler counts how many instructions
must be added to copy the operands so that that alternative applies.
The alternative requiring the least copying is chosen.  If two alternatives
need the same amount of copying, the one that comes first is chosen.
These choices can be altered with the `?' and `!' characters:

`?'     
     Disparage slightly the alternative that the `?' appears in,
     as a choice when no alternative applies exactly.  The compiler regards
     this alternative as one unit more costly for each `?' that appears
     in it.
     
`!'     
     Disparage severely the alternative that the `!' appears in.
     When operands must be copied into registers, the compiler will
     never choose this alternative as the one to strive for.
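
The comma-separated layout of alternatives and the choice among them
can be sketched in C.  This is a rough model under stated assumptions,
not the compiler's algorithm: the names `nth_alternative' and
`choose_alternative' are hypothetical, the number of copies each
alternative would need is supplied by the caller, and leading `=',
`+' and `%' characters are simply skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the letters of alternative number ALT (counting from zero) of
   CONSTRAINT into BUF.  Returns 1 if that alternative exists.  */
static int
nth_alternative (const char *constraint, int alt, char *buf,
                 size_t bufsize)
{
  const char *p = constraint;
  size_t len = 0;

  assert (bufsize > 0);
  while (*p == '=' || *p == '+' || *p == '%')
    p++;                        /* not part of any alternative */
  for (; alt > 0; alt--)
    {
      p = strchr (p, ',');
      if (p == NULL)
        return 0;
      p++;
    }
  while (*p != '\0' && *p != ',' && len + 1 < bufsize)
    buf[len++] = *p++;
  buf[len] = '\0';
  return 1;
}

/* ALTS[i] holds the letters of alternative i and COPIES[i] the number
   of copy insns it would need.  Returns the index of the alternative
   to use, or -1 if none is acceptable.  */
static int
choose_alternative (const char *const alts[], const int copies[], int n)
{
  int best = -1, best_cost = 0;

  for (int i = 0; i < n; i++)
    if (copies[i] == 0)
      return i;                 /* fits exactly: the insn is valid */
  for (int i = 0; i < n; i++)
    {
      int cost = copies[i];

      if (strchr (alts[i], '!') != NULL)
        continue;               /* never strive for this one */
      for (const char *p = alts[i]; *p != '\0'; p++)
        if (*p == '?')
          cost++;               /* one unit more costly per `?' */
      if (best < 0 || cost < best_cost)
        {
          best = i;             /* ties keep the earlier alternative */
          best_cost = cost;
        }
    }
  return best;
}
```

For the constraint `dKs,dmKs', `nth_alternative' yields `dKs' for
alternative 0 and `dmKs' for alternative 1.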


File: internals  Node: Class Preferences, Prev: Multi-Alternative, Up: Constraints, Next: Modifiers

Register Class Preferences
--------------------------

The operand constraints have another function: they enable the compiler
to decide which kind of hardware register a pseudo register is best
allocated to.  The compiler examines the constraints that apply to the
insns that use the pseudo register, looking for the machine-dependent
letters such as `d' and `a' that specify classes of registers.
The pseudo register is put in whichever class gets the most "votes".
The constraint letters `g' and `r' also vote: they vote in
favor of a general register.  The machine description says which registers
are considered general.

Of course, on some machines all registers are equivalent, and no register
classes are defined.  Then none of this complexity is relevant.
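
The voting scheme can be pictured with a short C sketch.  It is a
simplified model, not the compiler's implementation: the function name
`preferred_class' is hypothetical, each occurrence of a class letter
is assumed to count as one vote, and the tie-breaking rule (general
registers first, then the order of CLASS_LETTERS) is an assumption
made only for this illustration.  The `*' modifier is treated as the
section on modifier characters describes it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the register-class letter that gets the most votes among
   the N constraint strings, or '\0' if no letter votes at all.
   CLASS_LETTERS lists the machine-dependent class letters, for
   example "daf".  `g' and `r' both vote for `r', meaning a general
   register.  */
static char
preferred_class (const char *const constraints[], size_t n,
                 const char *class_letters)
{
  int votes[256] = { 0 };
  char best;
  int best_votes;

  assert (class_letters != NULL);
  for (size_t i = 0; i < n; i++)
    for (const char *p = constraints[i]; *p != '\0'; p++)
      {
        char c = *p;

        if (c == '*')
          {
            if (p[1] != '\0')
              p++;              /* the next character does not vote */
            continue;
          }
        if (c == 'g')
          c = 'r';
        if (c == 'r' || strchr (class_letters, c) != NULL)
          votes[(unsigned char) c]++;
      }

  best_votes = votes['r'];
  best = best_votes > 0 ? 'r' : '\0';
  for (const char *p = class_letters; *p != '\0'; p++)
    if (votes[(unsigned char) *p] > best_votes)
      {
        best = *p;
        best_votes = votes[(unsigned char) *p];
      }
  return best;
}
```

With class letters `daf', the constraints `d', `dm' and `a' give two
votes to `d' and one to `a', so the sketch prefers `d'.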


File: internals  Node: Modifiers, Prev: Class Preferences, Up: Constraints, Next: No Constraints

Constraint Modifier Characters
------------------------------

`='     
     Means that this operand is written by the instruction, but its previous
     value is not used.
     
`+'     
     Means that this operand is both read and written by the instruction.
     
     When the compiler fixes up the operands to satisfy the constraints,
     it needs to know which operands are inputs to the instruction and
     which are outputs from it.  `=' identifies an output; `+'
     identifies an operand that is both input and output; all other operands
     are assumed to be input only.
     
`%'     
     Declares the instruction to be commutative for operands 1 and 2.
     This means that the compiler may interchange operands 1 and 2
     if that will make the operands fit their constraints.
     
`#'     
     Says that all following characters, up to the next comma, are to be ignored
     as a constraint.  They are significant only for choosing register preferences.
     
`*'     
     Says that the following character should be ignored when choosing
     register preferences.  `*' has no effect on the meaning of
     the constraint as a constraint.


File: internals  Node: No Constraints, Prev: Modifiers, Up: Constraints

Not Using Constraints
---------------------

Some machines are so clean that operand constraints are not required.  For
example, on the Vax, an operand valid in one context is valid in any other
context.  On such a machine, every operand constraint would be `"g"',
excepting only operands of "load address" instructions which are
written as if they referred to a memory location's contents but actually
refer to its address.  They would have constraint `"p"'.

For such machines, instead of writing `"g"' and `"p"' for all
the constraints, you can choose to write a description with empty constraints.
Then you write `""' for the constraint in every `match_operand'.
Address operands are identified by writing an `address' expression
around the `match_operand', not by their constraints.

When the machine description has just empty constraints, certain parts
of compilation are skipped, making the compiler faster.


File: internals  Node: Standard Names, Prev: Constraints, Up: Machine Desc, Next: Dependent Patterns

Standard Insn Names
===================

Here is a table of the instruction names that are meaningful in the RTL
generation pass of the compiler.  Giving one of these names to an
instruction pattern tells the RTL generation pass that it can use the
pattern to accomplish a certain task.

`movM'     
     Here M is a two-letter machine mode name, in lower case.  This
     instruction pattern moves data with that machine mode from operand 1 to
     operand 0.  For example, `movsi' moves full-word data.
     
     If operand 0 is a `subreg' with mode M of a register whose
     natural mode is wider than M, the effect of this instruction is
     to store the specified value in the part of the register that corresponds
     to mode M.  The effect on the rest of the register is undefined.
     
`movstrictM'     
     Like `movM' except that if operand 0 is a `subreg'
     with mode M of a register whose natural mode is wider,
     the `movstrictM' instruction is guaranteed not to alter
     any of the register except the part which belongs to mode M.
     
`addM3'     
     Add operand 2 and operand 1, storing the result in operand 0.  All operands
     must have mode M.  This can be used even on two-address machines, by
     means of constraints requiring operands 1 and 0 to be the same location.
     
`subM3'     
`mulM3'     
`umulM3'     
`divM3'     
`udivM3'     
`modM3'     
`umodM3'     
`andM3'     
`iorM3'     
`xorM3'     
     Similar, for other arithmetic operations.
     
`andcbM3'     
     Bitwise logical-and operand 1 with the complement of operand 2
     and store the result in operand 0.
     
`mulhisi3'     
     Multiply operands 1 and 2, which have mode `HImode', and store
     a `SImode' product in operand 0.
     
`mulqihi3'     
`mulsidi3'     
     Similar widening-multiplication instructions of other widths.
     
`umulqihi3'     
`umulhisi3'     
`umulsidi3'     
     Similar widening-multiplication instructions that do unsigned
     multiplication.
     
`divmodM4'     
     Signed division that produces both a quotient and a remainder.
     Operand 1 is divided by operand 2 to produce a quotient stored
     in operand 0 and a remainder stored in operand 3.
     
`udivmodM4'     
     Similar, but does unsigned division.
     
`divmodMN4'     
     Like `divmodM4' except that only the dividend has mode
     M; the divisor, quotient and remainder have mode N.
     For example, the Vax has a `divmoddisi4' instruction
     (but it is omitted from the machine description, because it
     is so slow that it is faster to compute remainders by the
     circumlocution that the compiler will use if this instruction is
     not available).
     
`ashlM3'     
     Arithmetic-shift operand 1 left by a number of bits specified by
     operand 2, and store the result in operand 0.  Operand 2 has
     mode `SImode', not mode M.
     
`ashrM3'     
`lshlM3'     
`lshrM3'     
`rotlM3'     
`rotrM3'     
     Other shift and rotate instructions.
     
`negM2'     
     Negate operand 1 and store the result in operand 0.
     
`absM2'     
     Store the absolute value of operand 1 into operand 0.
     
`sqrtM2'     
     Store the square root of operand 1 into operand 0.
     
`one_cmplM2'     
     Store the bitwise-complement of operand 1 into operand 0.
     
`cmpM'     
     Compare operand 0 and operand 1, and set the condition codes.
     
`tstM'     
     Compare operand 0 against zero, and set the condition codes.
     
`movstrM'     
     Block move instruction.  The addresses of the destination and source
     strings are the first two operands, and both are in mode `Pmode'.
     The number of bytes to move is the third operand, in mode M.
     
`cmpstrM'     
     Block compare instruction, with operands like `movstrM'
     except that the two memory blocks are compared byte by byte
     in lexicographic order.  The effect of the instruction is to set
     the condition codes.
     
`floatMN2'     
     Convert operand 1 (valid for fixed point mode M) to floating
     point mode N and store in operand 0 (which has mode N).
     
`fixMN2'     
     Convert operand 1 (valid for floating point mode M) to fixed
     point mode N and store in operand 0 (which has mode N).
     
`truncMN'     
     Truncate operand 1 (valid for mode M) to mode N and
     store in operand 0 (which has mode N).  Both modes must be fixed
     point or both floating point.
     
`extendMN'     
     Sign-extend operand 1 (valid for mode M) to mode N and
     store in operand 0 (which has mode N).  Both modes must be fixed
     point or both floating point.
     
`zero_extendMN'     
     Zero-extend operand 1 (valid for mode M) to mode N and
     store in operand 0 (which has mode N).  Both modes must be fixed
     point.
     
`extv'     
     Extract a bit-field from operand 1 (a register or memory operand),
     where operand 2 specifies the width in bits and operand 3 the starting
     bit, and store it in operand 0.  Operand 0 must have `SImode'.
     Operand 1 may have mode `QImode' or `SImode'; often
     `SImode' is allowed only for registers.  Operands 2 and 3 must be
     valid for `SImode'.
     
     The RTL generation pass generates this instruction only with constants
     for operands 2 and 3.
     
     The bit-field value is sign-extended to a full word integer
     before it is stored in operand 0.
     
`extzv'     
     Like `extv' except that the bit-field value is zero-extended.
     
`insv'     
     Store operand 3 (which must be valid for `SImode') into a
     bit-field in operand 0, where operand 1 specifies the width in bits
     and operand 2 the starting bit.  Operand 0 may have mode `QImode'
     or `SImode'; often `SImode' is allowed only for registers.
     Operands 1 and 2 must be valid for `SImode'.
     
     The RTL generation pass generates this instruction only with constants
     for operands 1 and 2.
     
`sCONDM'     
     Store zero or -1 in the operand (with mode M) according to the
     condition codes.  Value stored is -1 iff the condition COND is
     true.  COND is the name of a comparison operation rtx code, such
     as `eq', `lt' or `leu'.
     
`bCOND'     
     Conditional branch instruction.  Operand 0 is a `label_ref'
     that refers to the label to jump to.  Jump if the condition codes
     meet condition COND.
     
`call'     
     Subroutine call instruction.  Operand 1 is the number of arguments
     and operand 0 is the function to call.  Operand 0 should be a `mem'
     rtx whose address is the address of the function.
     
`return'     
     Subroutine return instruction.  This instruction pattern name should be
     defined only if a single instruction can do all the work of returning
     from a function.
     
`tablejump'     
`caseM'     

